[Chapter Ten][Previous]
[Next] [Art of
Assembly][Randall Hyde]
Art of Assembly: Chapter Ten
- 10.9 - Nested Statements
- 10.10 - Timing Delay Loops
10.9 Nested Statements
As long as you stick to the templates provided in the examples presented
in this chapter, it is very easy to nest statements inside one another.
The secret to making sure your assembly language sequences nest well is
to ensure that each construct has one entry point and one exit point. If
this is the case, then you will find it easy to combine statements. All
of the statements discussed in this chapter follow this rule.
Perhaps the most commonly nested statements are the if..then..else statements.
To see how easy it is to nest these statements in assembly language, consider
the following Pascal code:
        if (x = y) then
            if (I >= J) then writeln('At point 1')
            else writeln('At point 2')
        else write('Error condition');
To convert this nested if..then..else to assembly language, start with the
outermost if, convert it to assembly, then work on the innermost if:
; if (x = y) then
                mov     ax, X
                cmp     ax, Y
                jne     Else0
; Put innermost IF here
                jmp     IfDone0
; Else write('Error condition');
Else0:          print
                byte    "Error condition",0
IfDone0:
As you can see, the above code handles the "if (X=Y)..." statement,
leaving a spot for the second if. Now add in the second if as follows:
; if (x = y) then
                mov     ax, X
                cmp     ax, Y
                jne     Else0
; IF ( I >= J) then writeln('At point 1')
                mov     ax, I
                cmp     ax, J
                jnge    Else1
                print
                byte    "At point 1",cr,lf,0
                jmp     IfDone1
; Else writeln ('At point 2');
Else1:          print
                byte    "At point 2",cr,lf,0
IfDone1:
                jmp     IfDone0
; Else write('Error condition');
Else0:          print
                byte    "Error condition",0
IfDone0:
The nested if statement in the assembly listing above runs from the
"IF ( I >= J)" comment through the IfDone1 label.
There is an obvious optimization which you do not really want to make until
speed becomes a real problem. Note that in the innermost if statement above,
the jmp IfDone1 instruction simply jumps to a jmp instruction which transfers
control to IfDone0. It is very tempting to replace the first jmp with one
which jumps directly to IfDone0. Indeed, when you go in and optimize your
code, this would be a good optimization to make. However, you shouldn't make
such optimizations to your code unless you really need the speed. Doing so
makes your code harder to read and understand. Remember, we would like all
our control structures to have one entry and one exit. Changing this jump as
described would give the innermost if statement two exit points.
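For readers who find it easier to trace the control flow in a high level language, the following C sketch mirrors the assembly listing label for label. It is only an illustration: the function name nested_if and the out buffer are assumptions made for this example (the original code prints the messages with the print macro instead of storing them), and the goto statements stand in for the conditional and unconditional jumps.

```c
#include <string.h>

/* Hypothetical helper: copies the message the assembly code would print
   into out. Labels match those in the assembly listing. */
void nested_if(int x, int y, int i, int j, char *out)
{
    if (x != y) goto Else0;         /* cmp ax, Y / jne Else0 */

    /* Inner if: entered only here, left only through IfDone1. */
    if (!(i >= j)) goto Else1;      /* cmp ax, J / jnge Else1 */
    strcpy(out, "At point 1");
    goto IfDone1;                   /* jmp IfDone1 */
Else1:
    strcpy(out, "At point 2");
IfDone1:
    goto IfDone0;                   /* jmp IfDone0 */

Else0:
    strcpy(out, "Error condition");
IfDone0:
    return;
}
```

Calling nested_if(1, 1, 5, 3, buf) with a sufficiently large buf takes the "At point 1" path. Both arms of the inner if leave through IfDone1, which is the single exit point the text recommends keeping.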
The for loop is another commonly nested control structure. Once again, the
key to building up nested structures is to construct the outside object
first and fill in the inner members afterwards. As an example, consider the
following nested for loops which add the elements of a pair of two
dimensional arrays together:
        for i := 0 to 7 do
            for j := 0 to 7 do
                A [i,j] := B [i,j] + C [i,j];
As before, begin by constructing the outermost loop first. This code assumes
that dx will be the loop control variable for the outermost loop (that is,
dx is equivalent to "i"):
; for dx := 0 to 7 do
                mov     dx, 0
ForLp0:         cmp     dx, 7
                jnle    EndFor0
; Put innermost FOR loop here
                inc     dx
                jmp     ForLp0
EndFor0:
Now add the code for the nested for loop. Note the use of the cx register
for the loop control variable on the innermost for loop of this code.
; for dx := 0 to 7 do
                mov     dx, 0
ForLp0:         cmp     dx, 7
                jnle    EndFor0
; for cx := 0 to 7 do
                mov     cx, 0
ForLp1:         cmp     cx, 7
                jnle    EndFor1
; Put code for A[dx,cx] := B[dx,cx] + C[dx,cx] here
                inc     cx
                jmp     ForLp1
EndFor1:
                inc     dx
                jmp     ForLp0
EndFor0:
Once again, the innermost for loop (from the "for cx" comment through the
EndFor1 label) sits entirely inside the body of the outer loop. The final
step is to add the code which performs the actual computation.
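As a rough illustration of the finished structure, the C sketch below keeps the same labels and uses variables named dx and cx to play the role of the registers. It assumes 8x8 arrays of int and a hypothetical function name add_arrays; the assembly version would still need the index computation for each array access, which this sketch leaves to the C compiler.

```c
#define N 8

/* Hypothetical helper: A[i][j] = B[i][j] + C[i][j] for an 8x8 array,
   written with gotos so each line maps onto the assembly skeleton. */
void add_arrays(int A[N][N], int B[N][N], int C[N][N])
{
    int dx, cx;

    dx = 0;                                 /* mov dx, 0 */
ForLp0:
    if (dx > 7) goto EndFor0;               /* cmp dx, 7 / jnle EndFor0 */

    cx = 0;                                 /* mov cx, 0 */
ForLp1:
    if (cx > 7) goto EndFor1;               /* cmp cx, 7 / jnle EndFor1 */
    A[dx][cx] = B[dx][cx] + C[dx][cx];      /* the loop body */
    cx++;                                   /* inc cx */
    goto ForLp1;                            /* jmp ForLp1 */
EndFor1:

    dx++;                                   /* inc dx */
    goto ForLp0;                            /* jmp ForLp0 */
EndFor0:
    return;
}
```

The inner loop is a self-contained unit with one entry (the assignment to cx) and one exit (EndFor1), so it drops into the outer loop without any change to the outer loop's code.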
10.10 Timing Delay Loops
Most of the time the computer runs too slowly for most people's tastes.
However, there are occasions when it actually runs too fast. One common
solution is to create an empty loop to waste a small amount of time. In
Pascal you will commonly see loops like:
for i := 1 to 10000 do ;
In assembly, you might see a comparable loop:
                mov     cx, 8000h
DelayLp:        loop    DelayLp
By carefully choosing the number of iterations, you can obtain a relatively
accurate delay interval. There is, however, one catch. That relatively accurate
delay interval is only going to be accurate on your machine. If you move
your program to a different machine with a different CPU, clock speed, number
of wait states, different sized cache, or half a dozen other features, you
will find that your delay loop takes a completely different amount of time.
Since there is better than a hundred to one difference in speed between
the high end and low end PCs today, it should come as no surprise that the
loop above will execute 100 times faster on some machines than on others.
The fact that one CPU runs 100 times faster than another does not reduce
the need to have a delay loop which executes some fixed amount of time.
Indeed, it makes the problem that much more important. Fortunately, the
PC provides a hardware based timer which operates at the same speed regardless
of the CPU speed. This timer maintains the time of day for the operating
system, so it's very important that it run at the same speed whether you're
on an 8088 or a Pentium. In the chapter on interrupts you will learn to
actually patch into this device to perform various tasks. For now, we will
simply take advantage of the fact that this timer chip forces the CPU to
increment a 32-bit memory location (40:6ch) about 18.2 times per second.
By looking at this variable we can determine the speed of the CPU and adjust
the count value for an empty loop accordingly.
The basic idea of the following code is to watch the BIOS timer variable
until it changes. Once it changes, start counting the number of iterations
through some sort of loop until the BIOS timer variable changes again. Having
noted the number of iterations, if you execute a similar loop the same number
of times it should require about 1/18.2 seconds to execute.
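The two-phase idea (wait for a tick boundary, then count iterations until the next tick) can be sketched in C. Because a portable C program cannot read the BIOS variable at 40:6ch, this sketch substitutes a simulated timer: FakeTimer, read_timer, and calibrate are all hypothetical names invented for the illustration, and polls_per_tick stands in for the speed of the CPU.

```c
/* Hypothetical stand-in for the BIOS timer variable. The simulated tick
   advances once every polls_per_tick reads; a faster CPU corresponds to
   a larger polls_per_tick. */
typedef struct {
    unsigned long polls;            /* number of reads so far */
    unsigned long polls_per_tick;   /* simulated CPU speed */
} FakeTimer;

unsigned long read_timer(FakeTimer *t)
{
    t->polls++;
    return t->polls / t->polls_per_tick;
}

/* Returns the number of loop iterations that fit into one timer tick. */
unsigned long calibrate(FakeTimer *t)
{
    unsigned long start;
    unsigned long count = 0;

    /* Phase 1: we may be in the middle of a tick, so wait for a change. */
    start = read_timer(t);
    while (read_timer(t) == start)
        ;

    /* Phase 2: count iterations until the timer changes again. */
    start = read_timer(t);
    do {
        count++;
    } while (read_timer(t) == start);

    return count;
}
```

In this simulation the returned count scales with polls_per_tick, which is the point of the technique: a faster machine measures a larger count, and running a similar loop that many times takes about one tick on either machine.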
The following program demonstrates how to create such a Delay routine:
                .xlist
                include         stdlib.a
                includelib      stdlib.lib
                .list

; PPI_B is the I/O address of the keyboard/speaker control
; port. This program accesses it simply to introduce a
; large number of wait states on faster machines. Since the
; PPI (Programmable Peripheral Interface) chip runs at about
; the same speed on all PCs, accessing this chip slows most
; machines down to within a factor of two of the slower
; machines.

PPI_B           equ     61h

; RTC is the address of the BIOS timer variable (40:6ch).
; The BIOS timer interrupt code increments this 32-bit
; location about every 55 ms (1/18.2 seconds). The code
; which initializes everything for the Delay routine
; reads this location to determine when 1/18.2 seconds
; have passed.

RTC             textequ <es:[6ch]>
dseg            segment para public 'data'

; TimedValue contains the number of iterations the delay
; loop must repeat in order to waste 1/18.2 seconds.

TimedValue      word    0

; RTC2 is a dummy variable used by the Delay routine to
; simulate accessing a BIOS variable.

RTC2            word    0

dseg            ends

cseg            segment para public 'code'
                assume  cs:cseg, ds:dseg

; Main program which tests out the DELAY subroutine.
Main            proc
                mov     ax, dseg
                mov     ds, ax

                print
                byte    "Delay test routine",cr,lf,0

; Okay, let's see how long it takes to count down 1/18th
; of a second. First, point ES at segment 40h in memory.
; The BIOS variables are all in segment 40h.
;
; This code begins by reading the memory timer variable
; and waiting until it changes. Once it changes we can
; begin timing until the next change occurs. That will
; give us 1/18.2 seconds. We cannot start timing right
; away because we might be in the middle of a 1/18.2
; second period.

                mov     ax, 40h
                mov     es, ax
                mov     ax, RTC
RTCMustChange:  cmp     ax, RTC
                je      RTCMustChange

; Okay, begin timing the number of iterations it takes
; for an 18th of a second to pass. Note that this
; code must be very similar to the code in the Delay
; routine.

                mov     cx, 0
                mov     si, RTC
                mov     dx, PPI_B
TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, RTC
                loope   TimeRTC

                neg     cx                      ;CX counted down!
                mov     TimedValue, cx          ;Save away

                mov     ax, ds
                mov     es, ax

                printf
                byte    "TimedValue = %d",cr,lf
                byte    "Press any key to continue",cr,lf
                byte    "This will begin a delay of five "
                byte    "seconds",cr,lf,0
                dword   TimedValue

                getc

                mov     cx, 90
DelayIt:        call    Delay18
                loop    DelayIt

Quit:           ExitPgm                         ;DOS macro to quit program.
Main            endp
; Delay18-This routine delays for approximately 1/18th sec.
;         Presumably, the variable "TimedValue" in DS has
;         been initialized with an appropriate count down
;         value before calling this code.

Delay18         proc    near
                push    ds
                push    es
                push    ax
                push    bx
                push    cx
                push    dx
                push    si

                mov     ax, dseg
                mov     es, ax
                mov     ds, ax

; The following code contains two loops. The inside
; nested loop repeats 10 times. The outside loop
; repeats the number of times determined to waste
; 1/18.2 seconds. This loop accesses the hardware
; port "PPI_B" in order to introduce many wait states
; on the faster processors. This helps even out the
; timings on very fast machines by slowing them down.
; Note that accessing PPI_B is only done to introduce
; these wait states, the data read is of no interest
; to this code.
;
; Note the similarity of this code to the code in the
; main program which initializes the TimedValue variable.

                mov     cx, TimedValue
                mov     si, es:RTC2
                mov     dx, PPI_B

TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, es:RTC2
                loope   TimeRTC

                pop     si
                pop     dx
                pop     cx
                pop     bx
                pop     ax
                pop     es
                pop     ds
                ret
Delay18         endp

cseg            ends

sseg            segment para stack 'stack'
stk             word    1024 dup (0)
sseg            ends
                end     Main
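The main program calls Delay18 90 times to approximate a five second delay. The arithmetic behind such a count can be sketched in C using the "about 18.2 ticks per second" figure from the text. The function names ticks_for_ms and ms_for_ticks are assumptions made for this illustration, and the results are only as accurate as the 18.2 approximation.

```c
/* Timer rate from the text: about 18.2 ticks per second,
   expressed as 182 ticks per 10 seconds to stay in integers. */
#define TICKS_PER_10_SEC 182UL

/* Number of 1/18.2 second delays closest to the requested time.
   Intended for modest values of ms (the product must fit in an
   unsigned long). */
unsigned long ticks_for_ms(unsigned long ms)
{
    return (ms * TICKS_PER_10_SEC + 5000UL) / 10000UL;
}

/* Approximate time, in milliseconds, taken by the given number of ticks. */
unsigned long ms_for_ticks(unsigned long ticks)
{
    return (ticks * 10000UL + TICKS_PER_10_SEC / 2) / TICKS_PER_10_SEC;
}
```

By this estimate, 90 calls correspond to roughly 4.9 seconds, and 91 calls would be the closest whole count to five seconds. Either value is well within the accuracy that the calibration loop itself can be expected to deliver.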
Art of Assembly: Chapter Ten - 27 SEP 1996